Testing Like William the Conqueror: Cultural and Instrumental Uses of Examinations 
Abstract: The spread of academic testing for accountability purposes in multiple countries has 
obscured at least two historical purposes of academic testing: community ritual and management 
of the social structure. Testing for accountability differs sharply in purpose from the academic 
challenges one can identify in community “examinations” in 19th century North America, or from 
the exams that controlled access to the civil service in Imperial China. Rather than testing for ritual or 
access to mobility, the modern uses of testing are much closer to the state-building project of a 
tax census, such as the Domesday Book of medieval Britain after the Norman Invasion, the social 
engineering projects described in James Scott's Seeing like a State (1998), or the “mapping the 
world” project that David Nye described in America as Second Creation (2004). This paper will 
explore both the instrumental and cultural differences among testing as ritual, testing as mobility 
control, and testing as state-building. 

Keywords: accountability; testing; history; cultural expression; state-building. 

Introduction 

Today, in the public international discourse of education reform, both supporters and 
opponents of test-based accountability argue that the debate over accountability is an explicit 
morality tale. Advocates of test-based accountability in the United States and elsewhere argue that 
such accountability is required for human-capital development and to satisfy equity concerns. In the 
testing-policy advocates’ morality tale, opponents are often painted as self-interested “defenders of 
the status quo.” In contrast, Finnish educator Pasi Sahlberg (2011) has argued that a global 
educational reform movement (or GERM) is an engulfing army that invades various countries, an 
invasion force that replaces alternative models of education as a public good. In Sahlberg’s morality 
tale, Finland and some other outposts are lonely, brave opponents of an educational colonialism. 

A morality-tale approach to accountability policy obscures important alternative perspectives 
on the discourse. This article focuses on both the instrumental and cultural uses of testing in a range 
of countries and periods. From a broader view, one can see three dominant uses of testing which 
present as either instrumental or cultural in orientation. However, though advocates of the three 
dominant uses have focused on either cultural or instrumental goals, in reality any use has both 
cultural and instrumental dimensions. For the purposes of this perspective, I broadly define “testing” as the 
formal assessment of examinee (commonly student) skills and achievement constructed and 
evaluated by non-examinees. In defining testing in this way, I borrow John Calhoun’s (1973) 
description of intelligence as audience-defined: assessment of students can be formal or informal, 
conducted by outsiders or by students themselves (as children who play competitive games can well 
attest). But if we consider testing as a formal outsider-driven process, then the formal testing 
administered by adults tells us a great deal about the purposes of schooling from adult, outsider 
perspectives. Doing so also avoids restricting discussion to eras and educational regimes that have 
access to particular technologies such as standardized, paper-and-pencil tests. 

The Historical Purposes of Testing 

To a first approximation, public testing has served at least one of three purposes: 
cultural ritual, status gatekeeping, or state-building. 1 2 Across multiple cultures and eras, the social 
repertoire of formal testing has served multiple purposes, but within a few regular patterns. Broadly 
speaking, those regular patterns circle around social affinity, selection, and cohesion. On one hand, 
the different uses of testing are a remarkable statement about the flexible purposes of the same types 
of behavior. On the other hand, the regularity of uses is an indication that testing serves a fairly 
stable social repertoire. The examples used here are included to demonstrate the varied uses of 
testing. 

Ritual 

An example of testing as cultural ritual is the end-of-session public examination in many 
early 19th century village (“district”) schools in the U.S. North. As explained by William Reese 
(2013), end-of-session exams were community events where adults witnessed both the teacher and 
the pupils reciting texts, answering questions, and otherwise engaged in ritualized performances. 
Long before Helen and Robert Lynd (1929) described Indiana small-town basketball as community 
glue, and long before Mary Metz (1991) explained how cultural scripts of high school imposed 
institutional isomorphism, American schools commonly followed a testing script that was focused 
on community and ritual. This common ritual was not universal: certainly, many male students beat 
teachers into a hasty retreat from a town in the 19th century, but it was common enough to be a 
recognized element in American schooling in the early 19th century. This ritual lives on in the 
Scripps National Spelling Bee and similar competitions. 

Gatekeeping 

A second purpose of testing is as a status gatekeeper, filtering applicants for positions in a 
social structure. Mazzeo (2001) calls this role the guidance function of testing when it occurs inside 
school structures, though the same function can also exist as a filter lying between levels of formal 
schooling, such as university entrance examinations. To think clearly about the role of testing, one 
can focus on external relationships between testing and the broader social structure. Two prominent 
examples of this purpose are the civil service exam in Imperial China and the construction of IQ 
exams in the early 20th century in the United States. For hundreds of years, a formal examination 
served as a minimum gatekeeper for civil service status in China. The importance of the examination 
lasted through a number of regime changes, became a prominent part of Chinese social rigidity, and 
provoked the creation and maintenance of a system of private tutoring of applicants. In the United 
States, the importation and translation of so-called intelligence tests in the 20th century served as 
one tool by which urban public school systems divided students into curricular tracks or streams. 
Consistently associated with race, class, and gender divides, IQ-based tracking decisions limited 
educational opportunities for millions in the first half of the 20th century. In both imperial China 
and 20th century North America, a significant purpose of testing was to serve as a status gatekeeper. 

1 The complementary perspective, by children, would be well worth the effort, probably best conducted by 
anthropologists. 

2 Other uses of testing are not considered here. Testing for individual diagnostic purposes and testing for 
formative assessment (or similar instructional-guidance purposes) are recent phenomena; they could perhaps 
be incorporated in the analysis here, but they are outside the scope of this paper. 

State-building 

A third purpose of testing is to serve as part of state capacity-building, which is also 
commonly referred to as state-building. As Reese (2013) explains, early advocates of standardized 
testing in American common schools thought it could be used to put pressure on schools to 
improve, either directly or indirectly (the latter through competitive exams to enter secondary 
schools). More recently, advocates of test-based accountability have encouraged the development of 
additional state capacity to manage both the process of testing and the analysis, dissemination, and 
use of those test results to manage schools and school personnel. It is important to understand the 
parallels between test-based accountability and the types of state-building projects that James Scott 
(1998) describes in Seeing like a State. While Scott focuses on modern state-building and instrumental 
(and largely failed) utopianism, the activity of data collection is much broader and older. The 
construction of the Domesday Book on the orders of William the Conqueror is an example of the type 
of state-building data collection that predates both industrialization and the modern nation-state. 
Without a rigid definition of state-building, it is sufficient for now to note how accounting activities 
serve both central government functions such as tax collection and also provide an intimate 
connection between central governments and the everyday life of people. 

The Cultural Uses of Testing 

While testing as ritual, testing as gatekeeper, and testing as state apparatus appear to be very 
different in character, in some ways one can describe them all as having both cultural uses and also 
instrumental uses. The cultural use of testing exists even when testing is not used for ritualistic 
purposes. That stems from the fact that instrumental functions are generally tied to particular 
discourses around education, and also the ways in which the experience of being tested generates 
common touchpoints for cultural expression. Certain school rituals may be bounded by time and 
place, and it may be a matter of some irony that more standardized, authority-driven, and common 
experiences can be the genesis or inspiration of more cultural expression than less standardized 
school practices. But that is a feature of the ideological role of authority, not an accident. As David 
Nye observed (2004), the act of surveying townships in the young United States was as much an 
assertion that the United States was outside history, remaking North America, as the surveying was a 
practical act of governance. Measuring and using power is not just a matter of “seeing like a state” 
(Scott, 1998) but defining and communicating like a state as well. Those acts invariably become part 
of the cultural history of examination, and this section describes three cultural uses of testing. 
Affirmation of Community 

As described above, the use of testing for ritualistic purposes in schooling may take the form 
of public performances, in which case one cultural use of such testing is to affirm the worth and 
nature of the community. Consider the ritualistic school performances in the young United States. 
In enforcing a public performance that invited interaction, such public “examination” validated the 
connection between schooling and the local adults, the power of local adults over the schoolmaster, 
and the experience of pupils during the school session. Such validation could also exist with the 
discourses surrounding testing for gatekeeping or testing for state authority. In the case of 
gatekeeping, the discourse of testing and tracking has often involved an affirmation of community 
by exclusion: with admissions tests for selective secondary schools and universities, for example, the 
students within the school are identified as being both worthy of the school’s education and 
simultaneously members of the select community of the school. More generally, the discourse of 
testing as gatekeeping across time and place has often revolved around meritocratic ideologies—if 
not the meritocracy of an entire society, at least the competence, merit, and virtue of those who pass 
the tests and become the members of the Chinese imperial bureaucracy, the graduates of English 
universities, or members of professions with limited access. 

While the discourse of testing as gatekeeping emphasizes the community of successful 
examinees, the discourse of testing for state-building emphasizes the community of citizens—in 
human-capital rhetoric, contributing adults or children presumed to be future contributing adults. 
One variant of this discourse of community is the argument for high-stakes testing for equity 
purposes: in this case, the role of testing is not to build citizens but to guarantee the rights bestowed 
by citizenship. One final role of citizenship tied to testing-as-state-building involves the citizen as 
taxpayer: what are taxes purchasing when they are spent on schooling? In the United States, this 
discourse began no later than the mid-1960s and became a prominent reason for the construction of 
state assessment systems in the 1970s (e.g., Dorn, 2007). 

Normalization 

A second cultural purpose of testing has been to establish normative boundaries, beyond 
the definition of a community. The setting of cultural norms is a messy, complicated process, and it 
may help to think of testing as a common experience that both prepares and can be used for setting 
expectations. For ritual exams, those expectations may have been minimal (can one recite a poem?), 
but they need not be. In contrast with the normalization of performance standards provided by 
ritual examinations, testing for gatekeeping has frequently generated a discourse around the 
normalization of the social structure, either the existing structure or a desired meritocratic structure 
of the future. As Dreeben (1968) argued, one historical role of schools has been to strip away the 
veneer of self-esteem provided by students’ families, teaching students that their personal worth is to 
be judged by a narrow range of competencies. Kliebard (1986) describes an ideology of social 
efficiency tied to IQ testing and tracking in the early 20th century in the United States. But not all 
testing for gatekeeping purposes has been tied to justifying the existing social stmcture. Lemann 
(2000) argued that the growing use of the SAT for college admissions in the mid-20th century 
United States served the interests of elite university administrators who wanted to generate a 
national student market. It is important to note that in Lemann’s view, the vision of elite university 
administrators was not one of broad access to elite education but of equal opportunity, and of a 
reconfirmation of the status of elite universities for a modern economy. 

In addition, the discourse around testing for accountability has used and fed into norms 
about schooling. In a classic case of policy feedback, former Florida Governor Jeb Bush pushed for 
a policy in 1999 where the state labeled each local public school with a letter grade, A through F, 
akin to the North American letter grades assigned to pupils in school subjects. These so-called school 
grades have been based largely on student performance on mandated state tests, and the relative 
generosity of A and B grades applied to primary schools encouraged principals of those schools to 
publicly declare themselves as an “A school” or “B school” using roadside signs. Education 
reporters have casually used the grades as adjectives in like manner, including “failing school” as a 
term applied to schools with a state label of F. In generating this discourse, the A-F labeling policy 
normalizes test-based accountability as part of the “real school” cultural script (Metz, 1991). In 
doing so, the A-F labeling policy is more politically robust than an alternative labeling scheme would 
have been. 3 In recent years, many other states and some cities have tried to adopt Florida’s policy. 
Because the other jurisdictions have been less generous with the higher labels, it is uncertain whether 
their policies will be as politically robust. 4 

Cultural Referent 

A third manner in which testing has cultural uses is when the experience of testing has 
become a touchpoint for cultural expression. The Scripps National Spelling Bee in the U.S. became 
the subject of the documentary Spellbound (Blitz, 2002). Often enough, spelling contests among 
adults (not children) have been the fodder for fiction, as in the “Spelling-School” in Edward 
Eggleston’s Hoosier Schoolmaster (1871) or spelling contests at “Literaries” in Laura Ingalls Wilder’s 
fictionalized stories from her childhood. Currently, the Groot Dictee der Nederlandse Taal is a 
televised dictation challenge for adults in the Low Countries that has run since 1991. In children’s 
literature, examination rituals are often transformed: The “Spelling Bee” is a character (not an event) 
in Norton Juster’s child fantasy The Phantom Tollbooth (1961), and English-style ‘O’ and ‘A’ level 
exams became O.W.L. (Ordinary Wizarding Level) and N.E.W.T. (Nastily Exhausting Wizard Test) 
in J. K. Rowling’s Harry Potter series. The suspense involved in examinations has worked its way 
into fiction when the tests have a gatekeeping or state-building function: in the movie Stand and 
Deliver (1988), the climax comes when math teacher Jaime Escalante’s students retake an Advanced 
Placement exam after they were accused of cheating because their scores were too good in 
comparison with stereotyped expectations of Latino students. 5 In contrast with the serious treatment 
of AP exams in Stand and Deliver, Wu Ching-tzu's eighteenth-century novel The Scholars satirized the 
imperial Chinese civil service exam, poking fun at perennial candidates standing for the exam, the 
network of private tutors benefitting from the existence of the exam, and the elaborate warping of 
social prestige conferred by exam passage (see, e.g., Miyazaki, 1976). Wu’s novel is a long 
form of satire that was common in the later imperial period; P'u Sung-ling had written a number of 
shorter satires of the examination system around 1700 (Elman, 2000). 

In all of these ways, testing as an activity and examination systems as social phenomena are 
generators or objects of cultural interpretation. The definition of community by either inclusion or 
exclusion is a cultural use of testing that is inscriptive in nature, delineating boundaries and qualities 
at a local level. More imposing is the use of testing for the creation or reformation of social norms. 
Definition and norming are both generative activities, while using test experiences as a springboard 
for cultural creation is the use of testing as an object of expression, of arguments about the tests and 
test systems themselves. It is important to note the ways in which apparently “functional” purposes 
of testing can still become part of cultural expression, either in the discourse surrounding the 
function or as an object of commentary and social criticism. 

3 Since the Republican party in Florida has held onto the governor’s seat and both houses of the legislature 
since 1999, that political stability has also ensured continuity of the labeling policy. 

4 In fall 2014, New York City’s chancellor announced that city’s abolition of A-F school grades. 

5 They passed the second time, too. 

Instrumental Uses of Testing 

If one side of examinations is the cultural use of testing, then the other side is a functional 
use of testing, one associated more with blunt application of instrumental power than with 
discourse. The division between the two is somewhat arbitrary—many would see the normative 
purpose described above as instrumental. The distinction is still helpful to illustrate the broader 
point that testing is a dual-use activity in terms of both application of power and interaction with 
cultural expression. Again, this dual use is true whether one is discussing the apparently-instrumental 
purposes of gatekeeping or state authority, on the one hand, or the softer use of testing for ritual 
purposes, on the other. To illustrate these dynamics, this section discusses four instrumental uses of 
testing: sorting at the micro or macro level, intervening in the lives of a school, enforcing a 
curriculum, and enforcing a language. 

Sorting 

One function of testing is to sort. The sorting can be for purposes of social prestige, the type 
of informal ranking that is loosely defined but important at a local level, or for formal access (or 
denial of access) to opportunities in jobs or further education. Examinations for ritual purposes have 
also served as a performative delineation of social status or basis for judgment of prestige by others. 
The gatekeeping purpose of testing explicitly sorts examinees as individuals. Beyond the sorting of 
individuals, the gatekeeping role of testing also sorts opportunities for society more broadly. Testing 
is thus an implicit method to shape the social structure, including social classes. European countries 
with examination-based secondary tracking systems have done so implicitly for several decades, 
approximately one half-century after North American schools expanded tracking along with the 
expansion of high schools around the turn of the twentieth century. Examination-based entry into 
professions is a more explicit form of shaping social structure at least as far as advantaged and 
prestigious occupations are concerned, whether the imperial civil service or more modern-era 
professions such as medicine and law. Whether justified in softer terms such as guidance (see 
Mazzeo, 2001), in firmer terms of implied fixed ability, or in the slippery definition of deviance (e.g., 
Johnson, 1968), gatekeeping assessment systems in a range of times and places have become part of 
the repertoire for generating and regenerating a social structure. 

While one may think of test-based sorting in terms of its effects on examinees, one 
important legacy of sorting students is the dual use of such mechanisms, with the capacity to sort 
and rank their teachers as well. Even at the level of informal sorting of social prestige, this type of 
function can be applied as much to a teacher as to students, as when a music teacher’s ensemble serves as an 
indicator of her or his value among other music educators. But even before the recent development 
of formal test-based accountability systems, testing has been used as a form of sorting for teachers. 
In England, Payment by Results long predated the post-World War 2 interest in so-called merit pay 
(e.g., Rapple, 1994). Such inclination to judge and sort teachers predated Payment by Results: as 
early high schools in North America often used blindly-graded entrance exams, those results became 
one potential filter by which local grammar-school principals were judged if they applied to teach at 
Philadelphia Central High School (Labaree, 1988). 
Intervening 

A second instrumental use of testing has been in deciding to intervene in local school 
matters, whether at the level of the child, the classroom, or the school. While teachers have based 
informal instructional judgments on classroom assessments of performance for many decades in 
different societies, one finds it difficult to draw a clear distinction between the use of such judgment 
for intervention in contrast with sorting. In Joseph Lancaster’s monitorial school model in the early 
19th century, frequent testing would decide on the role of pupils within a class as well as placement 
in recitation groups (e.g., Kaestle, 1973). Lancaster’s use of assessment looks much closer to sorting 
than deliberative instructional adjustment. Closer to the mark is the practice of retaining or 
promoting students between curriculum levels, using examinations to do so. In the United States, 
such actions could be seen beginning in late nineteenth-century cities with the evolution of 
age-grading. While the research on grade retention is mixed at best, there is a plausible interventionist 
argument to make about retention in grade, or (less common) advancement in grades and tracks or 
streams in the middle of school years. 6 

As with sorting, the use of testing for interventions extends beyond the examinees to 
schools and employees. One source of such pressures has been the school system itself as an 
organization, where test-based accountability and so-called league tables can create political pressure 
on school administrators to make significant changes in schools, with or without explicit mandates 
such as those embodied in No Child Left Behind in the United States. Another source of such 
pressures has been parents. As Labaree (1997) notes, the private uses of schools include the role of 
schools in promoting economic advancement by students and their families. Preparing students for 
examinations is thus a frequent expectation of families with significant advantages where those 
examinations serve gatekeeper purposes. While preparation for exams can be private, as with the post-World War 2 
juku or cram schools in Japan, parents have also made judgments about public schools 
from the use of both entrance exams in postwar university admissions and also examination-based 
credit for college such as the United States Advanced Placement system. In recent decades, for 
example, wealthy or middle-class parents have held increasing expectations that high schools would 
offer Advanced Placement courses, thus preparing students for exams that can provide some college 
credit. In this context, parental pressure on schools to offer a certain curriculum can be seen as a 
form of intervention. 

Enforcing a Curriculum 

A third instrumental use of testing has been in the enforcement of what examinees should 
learn. For example, for an early state commissioner of education in the U.S., Henry Barnard, 
entrance exams to high schools would “operate as a powerful and abiding stimulus to exertion 
throughout all the lower schools” even if only a minority of students attended high school (Barnard, 
1865, p. 283). Though competitive admissions exams served a gatekeeping purpose on the surface, 
Barnard thought they could help with the control of primary and intermediate schools. The use of 
exams to enforce a minimal curriculum has become much more transparent in the past half-century, 
as central state curriculum planning has envisioned a tightly-linked connection between mandated 
curriculum, testing, and accountability. Professional licensure exams have encouraged the 
standardization not only of formal, large schooling, but also of private test preparation: examinations 
for new lawyers in some cultures can prompt the type of preparation and private tutoring that is as 
intense as formal legal classes. The putative purpose of such licensure exams is to guarantee a 
minimal level of professional knowledge for lawyers, medical personnel, those who work on 
electrical and plumbing systems, and so forth, and it is somewhat reassuring to know that a licensed 
electrician has to demonstrate at least a minimal knowledge of electrical regulatory code to acquire 
the right to wire a building. 

6 As indicated earlier, I will not be discussing more recent research on the use of formative assessment, in part 
because of that recency and in addition because it is uncertain how extensive such practices are (e.g., Dorn, 
2010). 

It is important to understand that the enforcement of a curriculum is not just the domain of 
apparently-functional purposes for testing. In ritual examinations, the norms have involved both the 
standards of performance and the implied curriculum—in early federal America, recitation and other 
acts of memorization were validated by end-of-session public examination. To take a modern 
example, musical performance festivals for school ensembles are both a demonstration of 
performance (and implied leadership of music teachers) and also a confirmation of canonical 
material judged to be appropriate. The organization of such festivals has required securing rotating 
sites on school grounds and publicity of the results, and they have thus also been an affirmation of 
the role of music education in the curriculum in an era when arts education in the United States is 
unvalidated by state-mandated assessments. 

Enforcing a National Language 

At the intersection of curriculum and national identity is the role of schools in generating or 
enforcing language. In multiple contexts, the content and language of examinations has played a role 
in both the cultural authority of a language (e.g., spelling or dictation challenges) and the relationship 
between local communities and new nations. In some countries such as France or the United States, 
the presumed dominance of a desired single language is still an object of enforcement. 7 In 
postcolonial nations, the management of multiple native and legacy colonial languages has been 
both a logistical and a cultural object of enforcement through schools. In countries as varied as 
Zambia and Papua New Guinea, the language of instruction and examination has been a matter of 
conscious policy, and the overlay of colonial examination systems, with the language of 
examinations, has on occasion become embedded in the internal status contests of the new nation. 8 

This classification of instrumental uses of testing is highly tentative, focusing on selected 
uses that stretch across multiple times, places, and explicit purposes of examinations. This analysis 
excludes formative assessment because of its recent use and theorizing. The broader argument 
remains: the explicit dominant purpose of a testing regime does not eliminate either the cultural uses 
of testing or the instrumental uses. In the practices described here, the uses are generated from each 
type of testing. The discourse and instrumental uses of test-based accountability thus are not the 
unique or entirely new phenomenon that Sahlberg identified as his GERM. Rather, the history of 
examination for a range of purposes included both instrumental and cultural outlets. We may be 
experiencing a wave of testing that is organized as state control in a new way, but both the use of 
power and the cultural expressions surrounding test-based accountability are old. One can find 
schools in many countries asserting that they have accountability for the postindustrial global 
economy, but they remain close to testing like William the Conqueror. 


7 This essay does not analyze the discursive role of cultural authority bodies such as L'Académie française for 
French or advocates such as Noah Webster, a language nationalist for the young United States. 

8 For an example of the practical considerations, see Delpit’s (1995) description of education policy in newly- 
independent Papua New Guinea. Serpell (1993) uses the postcolonial experiences of Zambia to explore the 
social-cultural and psychological meaning of schooling, including language considerations. 
Why Has State-Building Dominated Recently? 

One remaining substantive question is why the use of testing for state-building has 
dominated recently—if Sahlberg’s putative Global Educational Reform Movement is not as unique 
in some ways as Sahlberg claims, why can he plausibly claim it as a unique phenomenon? Certainly, 
there is currently a great quantity of test-based accountability, accountability activities organized by 
central states. One tentative explanation is that the modern use of testing for state-building has been 
amenable to broad coalition-building, with policies justified by the uses of schools for human capital 
development, for mobility access, for modernization of a poorer state, for paternal approaches to 
childhood and poverty, for social control, and sometimes even to address concerns about inequality. 
If the advantage of testing for state-building is its utility for adding different purposes for 
justification, then the uses of testing for state-building should happen early and more quickly in 
societies where there is a potential desire or demand for state-building—societies with existing but 
messy central states. Despite anti-state ideology in the United States, that description fits the 
circumstances surrounding the adoption of No Child Left Behind in the United States in 2001, 
where policymakers in both national parties had an interest in centralizing control of education and 
where states and the federal government were willing to borrow authority from each other in pursuit 
of mutual (state-building) ends (Manna, 2006). 

A search for stability can explain the uses of testing and related pressures in industrial 
subregions in the People’s Republic of China, where economic transitions have put enormous 
stresses on the regime’s legitimacy. (For a different interpretation of Chinese educational structures, 
see Zhao, 2014.) It is important to understand the role of such purposes in authoritarian regimes 
where the ability to manage dissent without coercion is as critical as explicit uses of state power. 
State-building through testing is thus a potentially appealing mechanism in authoritarian regimes in 
critical transitions. The ubiquity of testing as a state activity is relatively new. As such, it is an 
example of institutional copycat behavior—what DiMaggio and Powell (1983) described as 
institutional isomorphism. For different types of regime, instrumental uses of testing serve 
state-building purposes. 